Association Rule Hiding Based on Evolutionary Multi-Objective Optimization by Removing Items
Authors
Abstract
Data mining technologies such as association rule mining help people find valuable knowledge in large amounts of data. However, sharing data among different organizations also risks exposing sensitive or confidential information. This raises a question: how can we prevent sensitive knowledge from being discovered while ensuring that ordinary, non-sensitive knowledge can still be mined to the maximum extent possible? In this paper, we address the problem of privacy preservation in association rule mining from the perspective of multi-objective optimization. A new hiding method based on evolutionary multi-objective optimization (EMO) is proposed, and the side effects generated by the hiding process are formulated as optimization goals. EMO is used to find candidate transactions to modify so that side effects are minimized. Comparative experiments with exact methods on real datasets demonstrated that the proposed method can hide sensitive rules with fewer side effects.

Copyright © 2014, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Introduction

Data often need to be shared among different organizations during business collaboration for mutual benefit. People can use data mining techniques to extract useful knowledge from the large shared data collection. However, despite its benefits to business decision making, data mining technology can also pose the threat of disclosing sensitive knowledge to other parties. A feasible solution is to modify the original database so that the sensitive knowledge cannot be mined. In this paper, we focus on privacy preservation in association rule mining. Modifying the database may also conceal non-sensitive rules. The challenge is to hide the sensitive rules while keeping as many non-sensitive rules as possible minable in the modified database.

Atallah et al. (Atallah et al. 1999) first proposed a protection algorithm for data sanitization and proved that finding an optimal solution to this problem is NP-hard. Dasseni (Dasseni et al. 2001) and Verikios (Verikios et al. 2004) extended itemset hiding to association rules and proposed heuristic hiding approaches, namely Algorithms 1.a, 1.b, 2.a and 2.b. These approaches hide sensitive rules by deleting or inserting items to decrease the supports or confidences of sensitive rules below the specified thresholds. Amiri (Amiri 2007) proposed heuristic algorithms that hide itemsets (not rules) by removing transactions or items, chosen according to the number of sensitive and non-sensitive itemsets they relate to.

Although the rule hiding problem naturally has the characteristics of a multi-objective optimization problem, as far as we know, no related work solves it from a multi-objective optimization point of view. We therefore adopted an evolutionary multi-objective optimization (EMO) algorithm to solve this problem. The side effects were formulated as optimization goals to be minimized. To modify the database and hide rules, we remove selected items from identified transactions that support sensitive rules, so that the sensitive rules are no longer mined from the modified database at the predefined thresholds.

The main contributions of this paper are as follows. First, we treat the rule hiding problem as a multi-objective optimization process and, for the first time, adopt the EMO method to solve it. Second, compared with deterministic methods, the proposed EMO-based hiding approach can hide all sensitive rules with fewer side effects in most cases, at the cost of more running time.
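The item-removal model and the idea of side effects as objectives can be illustrated with a small sketch. It is not the authors' implementation: the toy database, the thresholds, the helper names (`count`, `is_mined`, `remove_item`, `side_effects`), and the choice of two objectives (sensitive rules still mined, non-sensitive rules lost) are all assumptions made for illustration, and no evolutionary search is performed. The code only evaluates the objective vector of a single candidate modification.

```python
from typing import FrozenSet, List, Set, Tuple

Transaction = Set[str]
Rule = Tuple[FrozenSet[str], FrozenSet[str]]  # (antecedent, consequent)


def count(db: List[Transaction], itemset: FrozenSet[str]) -> int:
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)


def is_mined(db: List[Transaction], rule: Rule,
             min_sup: float, min_conf: float) -> bool:
    """True if the rule meets both the support and confidence thresholds."""
    antecedent, consequent = rule
    both = count(db, antecedent | consequent)
    ante = count(db, antecedent)
    if ante == 0:
        return False
    return both / len(db) >= min_sup and both / ante >= min_conf


def remove_item(db: List[Transaction], tid: int, item: str) -> List[Transaction]:
    """Return a copy of `db` with `item` removed from transaction `tid`."""
    modified = [set(t) for t in db]
    modified[tid].discard(item)
    return modified


def side_effects(original: List[Transaction], modified: List[Transaction],
                 sensitive: List[Rule], non_sensitive: List[Rule],
                 min_sup: float, min_conf: float) -> Tuple[int, int]:
    """Assumed objective vector: (sensitive rules still mined, non-sensitive rules lost)."""
    hiding_failures = sum(
        is_mined(modified, r, min_sup, min_conf) for r in sensitive
    )
    lost_rules = sum(
        is_mined(original, r, min_sup, min_conf)
        and not is_mined(modified, r, min_sup, min_conf)
        for r in non_sensitive
    )
    return hiding_failures, lost_rules


# Hypothetical toy data: five transactions, one sensitive rule a -> b.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
sensitive = [(frozenset({"a"}), frozenset({"b"}))]
non_sensitive = [
    (frozenset({"b"}), frozenset({"c"})),
    (frozenset({"a"}), frozenset({"c"})),
]
MIN_SUP, MIN_CONF = 0.4, 0.7

# Compare two candidate modifications: remove "b" from transaction 0 or 1.
for tid in (0, 1):
    candidate = remove_item(db, tid, "b")
    print(tid, side_effects(db, candidate, sensitive, non_sensitive,
                            MIN_SUP, MIN_CONF))
# prints "0 (0, 1)" then "1 (0, 0)"
```

Both candidates push the confidence of a -> b below the threshold, but removing the item from transaction 0 also loses the non-sensitive rule b -> c, while removing it from transaction 1 does not. A multi-objective search would compare many such candidates by their objective vectors and keep the non-dominated ones.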
Similar resources
Use HypE to Hide Association Rules by Adding Items
During business collaboration, partners may benefit from sharing data. People may use data mining tools to discover useful relationships in shared data. However, some relationships are sensitive to the data owners, who hope to conceal them before sharing. In this paper, we address this problem in the form of association rule hiding. A hiding method based on evolutionary multi-objective op...
Multi-objective Association Rule Mining using Evolutionary Algorithm
Generally, association rule mining (ARM) algorithms, like the Apriori algorithm, first produce frequent itemsets and afterward derive, from the frequent itemsets, the association rules that exceed the minimum confidence threshold. When the data volume is large, it takes a number of scans to generate frequent items. It is a better idea if all the association rules generated directly without generat...
Multi-objective optimization design of plate-fin heat sinks using an Evolutionary Algorithm Based On Decomposition
This article has no abstract.
Fuzzy Apriori Rule Extraction Using Multi-Objective Particle Swarm Optimization: The Case of Credit Scoring
Many methods have been introduced to solve the credit scoring problem, such as support vector machines, neural networks and rule-based classifiers. Rule bases are favoured in credit decision making because of their ability to explicitly distinguish between good and bad applicants. In this paper, multi-objective particle swarm is applied to optimize a fuzzy Apriori rule base in credit scoring. ...
Journal title:
- Intell. Data Anal.
Volume 20, Issue
Pages -
Publication date: 2014